This paper describes preconditioned conjugate gradient methods for
solving sparse symmetric and positive definite systems of linear equations.
Necessary and sufficient conditions are given for when these preconditioners

1. Introduction


In this paper we are concerned with the solution of a sparse N x N
system of symmetric and positive definite linear equations

    $K u = f$                                                          (1.1)

by preconditioned conjugate gradient (PCG) methods. For a detailed
description of these methods see Concus, Golub, O'Leary [1976] and Chandra
[1978].

The PCG method solves the system $\hat{K} \hat{u} = \hat{f}$, where

    $\hat{K} = Q^{-1} K Q^{-T}$,  $\hat{u} = Q^{T} u$,  $\hat{f} = Q^{-1} f$,        (1.2)

$Q$ is a nonsingular matrix, and the symmetric and positive definite
preconditioning matrix is given by $M = Q Q^{T}$. The algorithm for the solution
of $u$ directly is described in Chandra [1978] and is given below, where $u$, $r$,
$\tilde{r}$, and $p$ are vectors and $(x,y)$ denotes the inner product $x^{T} y$.

    (1) Choose $u^0$
    (2) $r^0 = f - K u^0$
    (3) Solve $M \tilde{r}^0 = r^0$
    (4) $p^0 = \tilde{r}^0$
    (5) $k = 0$
    (6) For $k = 0, 1, \ldots, k_{max}$
        (1) $\alpha = (\tilde{r}^k, r^k) / (p^k, K p^k)$
        (2) $u^{k+1} = u^k + \alpha p^k$
        (3) If $\|u^{k+1} - u^k\|_\infty < \epsilon$ then stop; otherwise continue.
        (4) $r^{k+1} = r^k - \alpha K p^k$
        (5) Solve $M \tilde{r}^{k+1} = r^{k+1}$
        (6) $\beta = (\tilde{r}^{k+1}, r^{k+1}) / (\tilde{r}^k, r^k)$
        (7) $p^{k+1} = \tilde{r}^{k+1} + \beta p^k$

Algorithm 1. Preconditioned Conjugate Gradient Algorithm.


We note that the standard conjugate gradient algorithm results by choosing 
M = I. 
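
The steps of Algorithm 1 can be sketched in a few lines of NumPy. This is a minimal illustration rather than the implementation used for the results reported later: the function name `pcg`, the callback `solve_M` (which returns the solution of $M \tilde{r} = r$), and the default tolerances are assumptions introduced here.

```python
import numpy as np

def pcg(K, f, solve_M, u0=None, eps=1e-8, k_max=1000):
    """Sketch of Algorithm 1 for K u = f with K symmetric positive definite.

    solve_M(r) is a hypothetical callback returning r_tilde with M r_tilde = r.
    Returns the approximate solution and the number of iterations performed.
    """
    u = np.zeros_like(f, dtype=float) if u0 is None else u0.astype(float)
    r = f - K @ u                    # step (2): initial residual
    r_t = solve_M(r)                 # step (3): preconditioned residual
    p = r_t.copy()                   # step (4): first search direction
    rho = r_t @ r                    # inner product (r_tilde, r)
    for k in range(k_max + 1):       # step (6)
        Kp = K @ p
        pKp = p @ Kp
        if pKp == 0.0:               # p = 0: the residual has already vanished
            return u, k
        alpha = rho / pKp
        u_new = u + alpha * p
        if np.max(np.abs(u_new - u)) < eps:   # infinity-norm stopping test
            return u_new, k + 1
        u = u_new
        r = r - alpha * Kp
        r_t = solve_M(r)
        rho_new = r_t @ r
        beta = rho_new / rho
        p = r_t + beta * p
        rho = rho_new
    return u, k_max + 1
```

Passing `solve_M = lambda r: r` gives the unpreconditioned method (M = I), and `solve_M = lambda r: r / np.diag(K)` gives a diagonal (1-step Jacobi) preconditioner.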

In the next section, preconditioners that are based on taking m steps of
an iterative method are described, conditions for their applicability to and
effectiveness for symmetric and positive definite systems are given, and their
relationship to the preconditioners of Dubois, Greenbaum, Rodrique [1979] and
Johnson, Micchelli, and Paul [1982] is discussed. In Section 3, the
implementation of the m-step SSOR preconditioner on parallel machines is
discussed and results of this preconditioner on the CYBER 203/205 and the
Finite Element Machine are included.
2. m-Step Preconditioners

2.1. Choosing M

Algorithm 1 of the last section requires a symmetric and positive
definite preconditioning matrix $M$ to be specified or computed. The question
arises as to how to choose $M$ so that the condition number of $\hat{K}$,

    $\kappa(\hat{K}) = \max_i \lambda_i / \min_i \lambda_i$,

where the $\lambda_i$ are the eigenvalues of $M^{-1} K$, is as small as possible.

The best choice for $M$ in the sense of minimizing $\kappa(\hat{K})$ is $M = K$, but
this gains nothing since $M \tilde{r} = r$ is then just as difficult to solve as
$K u = f$.

One approach that has been taken in the literature is to choose M to be an
incomplete Cholesky factorization of K (Manteuffel [1979]). Another
approach is to choose M to be a symmetric and positive definite splitting
of K that describes a linear stationary iterative method (refer to Concus,
Golub, O'Leary [1976] and the references therein).

The question of interest here is whether it would be beneficial to take
more than one step of a linear stationary iterative method to produce a
preconditioner M that more closely approximates K. We begin by deriving an
expression for M. Let $K = P - Q$ be a splitting of K that is associated
with the linear stationary iterative method with iteration matrix $G = P^{-1} Q$.
Then the m-step iterative method applied to $K \tilde{r} = r$ is

    $P (I + G + \cdots + G^{m-1})^{-1} \tilde{r}^{(m)} = [P (I + G + \cdots + G^{m-1})^{-1} - (P - Q)] \tilde{r}^{(0)} + r$.    (2.1)

By choosing $\tilde{r}^{(0)} = 0$, (2.1) yields

    $M = P (I + G + \cdots + G^{m-1})^{-1}$.                              (2.2)
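
The meaning of (2.2) can be checked numerically: applying $M^{-1}$ to a vector is the same as taking m steps of the stationary iteration $P \tilde{r}^{(j+1)} = Q \tilde{r}^{(j)} + r$ from a zero initial guess. The sketch below is illustrative only; the function names and the use of dense NumPy solves are assumptions made here for brevity.

```python
import numpy as np

def m_step_solve(P, Q, r, m):
    """m steps of P r_new = Q r_old + r, starting from r_tilde^(0) = 0."""
    r_t = np.zeros_like(r, dtype=float)
    for _ in range(m):
        r_t = np.linalg.solve(P, Q @ r_t + r)
    return r_t

def m_step_inverse(P, Q, m):
    """Explicit M^{-1} = (I + G + ... + G^{m-1}) P^{-1}, with G = P^{-1} Q."""
    n = P.shape[0]
    G = np.linalg.solve(P, Q)
    R = np.zeros((n, n))
    for j in range(m):
        R += np.linalg.matrix_power(G, j)
    return R @ np.linalg.inv(P)
```

In practice only `m_step_solve` would be used inside the PCG loop; the explicit inverse is formed here solely to confirm that the two agree.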


Before we establish the necessary and sufficient conditions for M to be
symmetric and positive definite, we prove the following lemma.

Lemma 1.

    If $A = BC$ is a symmetric positive definite matrix, B is symmetric,
    and C has positive eigenvalues, then B is positive definite.

Proof

    Let $\lambda$ be an eigenvalue of $C^{-1}$ with eigenvector $x$, so that
    $C^{-1} x = \lambda x$, or equivalently

        $A^{-1} B x = \lambda x$.                                        (2.3)

    Multiply both sides by $A^{1/2}$ to get

        $(A^{-1/2} B A^{-1/2})(A^{1/2} x) = \lambda (A^{1/2} x)$                (2.4)

    or

        $R y = \lambda y$,  where $R = A^{-1/2} B A^{-1/2}$ and $y = A^{1/2} x$.

    The proof is now by contradiction. Assume that B has a nonpositive
    eigenvalue. Then, since (2.4) is a congruence transformation of B, it
    follows that R has a nonpositive eigenvalue (see Gantmacher
    [1959]). But the spectrum of R is identical to that of $C^{-1}$, and by
    hypothesis $C^{-1}$ cannot have a nonpositive eigenvalue. Hence B is
    positive definite.
The necessary and sufficient conditions for M to be positive definite are
given in Theorem 1.

Theorem 1.

    Let $K = P - Q$ be a symmetric positive definite matrix and let P be
    a symmetric nonsingular matrix. Then

    (1) the matrix M of (2.2) is symmetric.

    (2) for m odd, M is positive definite if and only if P is
        positive definite.

    (3) for m even, M is positive definite if and only if P + Q is
        positive definite.

Proof

    To prove symmetry, we write $M^{-1}$ as

        $M^{-1} = P^{-1} + P^{-1} Q P^{-1} + P^{-1} Q P^{-1} Q P^{-1} + \cdots + P^{-1} \underbrace{Q P^{-1} Q P^{-1} \cdots Q P^{-1}}_{m-1\ \text{terms}}$.    (2.5)

    Now since P and K, and hence Q, are symmetric, each term in (2.5) is
    symmetric. Thus M is symmetric.

    The matrix $G = P^{-1} Q$ can be expressed as
    $G = K^{-1/2} (I - K^{1/2} P^{-1} K^{1/2}) K^{1/2}$.
    Since $P^{-1}$ is symmetric with P, the eigenvalues of the congruence
    transformation $K^{1/2} P^{-1} K^{1/2}$ are real. Hence, the eigenvalues of G are
    real.

    To prove (2), let m be odd. If g is any eigenvalue of G other
    than 1, the corresponding eigenvalue of

        $R = I + G + \cdots + G^{m-1}$

    is

        $1 + g + \cdots + g^{m-1} = (1 - g^m)/(1 - g)$,

    which is positive since m is odd. If g = 1, the corresponding
    eigenvalue of R is equal to m and is also positive. Now, since
    $P = MR$ and M is symmetric and R has positive eigenvalues, it
    follows from Lemma 1 that if P is positive definite then M must
    also be positive definite. Conversely, M can be written as $M = P R^{-1}$.
    Since $R^{-1}$ has positive eigenvalues and P is symmetric, we conclude
    from Lemma 1 that if M is positive definite then P is also
    positive definite.

    Next, to prove (3), let m be even. It is sufficient to consider $M^{-1}$,
    since any conclusions about the definiteness of $M^{-1}$ will apply to
    M. Since m is even, $M^{-1}$ from (2.5) can be written as

        $M^{-1} = P^{-1} (P + PG + PG^2 + PG^3 + \cdots + PG^{m-1}) P^{-1}$

    or

        $M^{-1} = P^{-1} [(P + PG) + (P + PG) G^2 + (P + PG) G^4 + \cdots + (P + PG) G^{m-2}] P^{-1}$.

    Now, since $PG = Q$, $M^{-1}$ can be written as

        $M^{-1} = P^{-1} (P + Q)(I + G^2 + G^4 + \cdots + G^{m-2}) P^{-1}$.          (2.6)

    Since P is nonsingular and symmetric, $M^{-1}$ is positive definite if
    and only if the symmetric matrix

        $S = (P + Q)(I + G^2 + G^4 + \cdots + G^{m-2})$                     (2.7)

    is positive definite.

    Assume P + Q is positive definite. Since S is symmetric and the
    matrix $(I + G^2 + \cdots + G^{m-2})^{-1}$ has positive eigenvalues, S is positive
    definite by Lemma 1. Conversely, if S is positive definite, since
    P + Q is symmetric and the series $I + G^2 + \cdots + G^{m-2}$ has positive
    eigenvalues, P + Q is positive definite by Lemma 1.
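
Theorem 1 can be illustrated on a small example where the Jacobi splitting does not converge. The 3 x 3 matrix below is a hypothetical test case chosen here (unit diagonal, off-diagonal entries 0.8), for which P = I is positive definite but P + Q = 2I - K is not; the sketch then reports, for each m, whether the m-step matrix is positive definite. Since $M^{-1}$ is positive definite exactly when M is, the check is made on $M^{-1}$.

```python
import numpy as np

def m_step_inverse(P, Q, m):
    """M^{-1} = (I + G + ... + G^{m-1}) P^{-1} with G = P^{-1} Q, as in (2.2)."""
    n = P.shape[0]
    G = np.linalg.solve(P, Q)
    R = np.zeros((n, n))
    for j in range(m):
        R += np.linalg.matrix_power(G, j)
    return R @ np.linalg.inv(P)

def is_positive_definite(A):
    """True if the symmetric part of A has only positive eigenvalues."""
    return bool(np.linalg.eigvalsh((A + A.T) / 2).min() > 0)

a = 0.8                                           # assumed off-diagonal value
K = (1 - a) * np.eye(3) + a * np.ones((3, 3))     # eigenvalues 0.2, 0.2, 2.6
P = np.diag(np.diag(K))                           # Jacobi splitting: P = diag(K)
Q = P - K

report = {m: is_positive_definite(m_step_inverse(P, Q, m)) for m in range(1, 7)}
```

For this example the entries of `report` alternate with the parity of m, consistent with parts (2) and (3) of the theorem.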

Dubois, Greenbaum, and Rodrique [1979] considered a truncated Neumann
series for $K^{-1}$ as a preconditioner. Their preconditioner is equivalent to
that of (2.2) if $K = P - Q$ corresponds to a Jacobi splitting where P =
diag(K), but they do not consider more complicated splittings that result from
other iterative methods. Theorem 1 extends their main result. Under the
hypothesis that K and P are both symmetric and positive definite matrices
and $\rho(G) < 1$, they prove that M is symmetric and positive definite for
all m. Note that for odd m the condition that $\rho(G) < 1$ is not needed.

The relationship between the condition $\rho(G) < 1$ and the positive
definiteness of P + Q is given in Theorem 2.

Theorem 2.

    Let $K = P - Q$ be a symmetric positive definite matrix and let P be
    symmetric and nonsingular. Then $\rho(P^{-1} Q) < 1$ if and only if P + Q
    is positive definite.
Proof

    First, assume P + Q is positive definite. Since K is symmetric
    positive definite and P is nonsingular, $K = P - Q$ is a P-regular
    splitting. Hence, from Ortega's P-regular splitting theorem, Ortega
    [1972], $\rho(P^{-1} Q) < 1$.

    Now, assume that $\rho(G) < 1$. Then $(I - G)^{-1}$ exists and since G has real
    eigenvalues, it easily follows that the matrix H defined by

        $H = (I - G)^{-1} (I + G)$                                      (2.8)

    has real eigenvalues. But we know from Young ([1971], p. 82) that H
    is N-stable. Hence H has positive eigenvalues. Now, we can write
    H as

        $H = K^{-1} (P + Q)$                                            (2.9)

    or equivalently,

        $K = (P + Q) H^{-1}$.                                           (2.10)

    Finally, since K is symmetric and positive definite, $H^{-1}$ has
    positive eigenvalues, and P + Q is symmetric, we conclude from Lemma 1
    that P + Q is positive definite.

We note that the Jacobi Convergence Theorem given in Young [1971] is a
specific case of Theorem 2.

Theorem 1 and Theorem 2 are helpful in choosing a splitting of K that
will produce an m-step preconditioner that is symmetric and positive
definite. For example, if the Jacobi splitting of K (P = D and Q = D - K
where D is the diagonal of K) were considered, part (3) of Theorem 1 says
that if m is even, P + Q must be positive definite, and by Theorem 2 this
is only true when the Jacobi method is convergent. However, for problems of
interest to us, the Jacobi method is not guaranteed to be convergent since we
only know that K will be symmetric and positive definite; therefore, for
these problems, only odd values of m will yield m-step Jacobi
preconditioning matrices that are guaranteed to be positive definite.

2.2. Analysis of the Condition Number

In the last section, we gave conditions for M to be symmetric and
positive definite and hence to be considered as a preconditioner for the
conjugate gradient method. In this section we determine if increasing m
will, in fact, produce a better conditioned system. For this purpose, we now
denote by $M_m$ the matrix of (2.2).

As a first step towards answering this question, we derive an expression
for $\kappa(\hat{K}_m)$. Recall from (1.2) that $\hat{K}_m$ is similar to $M_m^{-1} K$, so that
$\kappa(\hat{K}_m)$ is the same as the ratio of the largest to smallest eigenvalue of
$M_m^{-1} K$. An expression for $M_m^{-1} K$ as a polynomial in G is

    $M_m^{-1} K = (I + G + \cdots + G^{m-1}) P^{-1} (P - Q)$                      (2.11)

or

    $M_m^{-1} K = I - G^m$,

where $G = P^{-1} Q$.

We wish to compare $\kappa(\hat{K}_m)$ to $\kappa(\hat{K}_{m+1})$ when both $M_m$ and $M_{m+1}$ are
symmetric and positive definite. By Theorem 1, this implies that P and
P + Q are positive definite and thus by Theorem 2, $\rho(G) < 1$. Under the
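hypothesis of Theorem 1 the eigenvalues $\lambda_i$ of G are real, and can be
ordered as

    $-1 < \lambda_1 \le \lambda_2 \le \cdots \le \lambda_n < 1$.

Furthermore, let $\delta$ be the eigenvalue with the smallest absolute value. Then
the condition number of $\hat{K}_m$ is

    $\kappa(\hat{K}_m) =$
        $(1 - \lambda_1^m)/(1 - \lambda_n^m)$,    $\lambda_1 \ge 0$, or $\lambda_1 < 0$ and m odd
        $(1 - \delta^m)/(1 - \lambda_n^m)$,      $\lambda_1 < 0$, $\lambda_n > |\lambda_1|$, m even          (2.12)
        $(1 - \delta^m)/(1 - \lambda_1^m)$,      $\lambda_1 < 0$, $|\lambda_1| \ge \lambda_n$, m even

As can be seen from (2.12), the conditions for $\kappa(\hat{K}_{m+1}) < \kappa(\hat{K}_m)$ depend upon
the distribution of the eigenvalues $\lambda_i$ of G. We note that for both odd and
even m, if $\lambda_1 < 0$ and $|\lambda_1| \ge |\lambda_n|$, it is impossible to decide whether
$\kappa(\hat{K}_{m+1}) < \kappa(\hat{K}_m)$ without knowledge of the values of $\lambda_1$, $\lambda_n$, and $\delta$. The
conditions for the remaining two cases are stated below:

    If $\lambda_1 \ge 0$, $\kappa(\hat{K}_m)$ is a decreasing function for all m.            (2.13a)

    If $\lambda_n > |\lambda_1|$ and $\lambda_1 < 0$,

    (a) for m odd, $\kappa(\hat{K}_{m+1}) < \kappa(\hat{K}_m)$.
                                                                      (2.13b)
    (b) for m even, $\kappa(\hat{K}_{m+1}) < \kappa(\hat{K}_m)$ if and only if

        $(1 + |\lambda_1|^{m+1})(1 - \lambda_n^m) < (1 - \delta^m)(1 - \lambda_n^{m+1})$.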
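
The dependence of the condition number on the parity of m can be seen on a small example. The sketch below uses the fact that the eigenvalues of $M_m^{-1} K = I - G^m$ are $1 - \lambda_i^m$, and evaluates $\kappa(\hat{K}_m)$ for the Jacobi splitting of the 10 x 10 tridiagonal matrix tridiag(-1, 2, -1). That matrix is a hypothetical test case chosen here because its Jacobi iteration matrix has eigenvalues in $\pm$ pairs; it is not one of the test problems of this paper.

```python
import numpy as np

def kappa_m(K, P, m):
    """Condition number of M_m^{-1} K = I - G^m, where G = I - P^{-1} K."""
    G = np.eye(K.shape[0]) - np.linalg.solve(P, K)
    lam = np.linalg.eigvals(G).real      # eigenvalues of G are real here
    mu = 1.0 - lam**m                    # eigenvalues of M_m^{-1} K
    return mu.max() / mu.min()

n = 10
K = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # assumed test matrix
P = np.diag(np.diag(K))                                # Jacobi splitting
kappas = [kappa_m(K, P, m) for m in range(1, 7)]       # m = 1, ..., 6
```

For this matrix the condition number drops when m goes from odd to even and rises again when m goes from even to odd, while the even values of m taken on their own give a decreasing sequence.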
As an application of (2.13a) consider the SSOR splitting of a symmetric
and positive definite matrix. Recall from the basic convergence theorem for
SSOR that if K is a symmetric matrix with positive diagonal elements, the
SSOR method converges if and only if K is positive definite and $0 < \omega < 2$.
Therefore, $\rho(G) < 1$ for this splitting and from Young [1971] we know that all
the eigenvalues of G are real and nonnegative. Since P is symmetric, it
follows from Theorems 1 and 2 that $M_m$ is symmetric and positive definite, and
from (2.13a) it follows that $\kappa(\hat{K}_m)$ is a decreasing function of m.

Results of the m-step SSOR preconditioned conjugate gradient method on a
1536 x 1536 symmetric and positive definite matrix derived from a finite
element discretization (triangles with linear basis functions) of a plate in
plane stress are given in Table I, and the results on a 768 x 768 matrix
derived from the 5-star discretization of Laplace's equation are given in
Table II. For these problems, results are given for both the natural rowwise
ordering and the Multi-color ordering (see Adams and Ortega [1982]) of the
grid. The convergence criterion was $\|u^{k+1} - u^k\|_\infty < \epsilon$, where the tolerance
$\epsilon$ is the same power of 10 for both problems. The conjugate gradient
results with no preconditioning are indicated by m = 0.
Table I. m-step SSOR PCG for 1536 x 1536 Plane Stress Problem

           R/B/G            Natural          Natural
    m      # Iterations     # Iterations     # Iterations
           ($\omega$ = 1)          ($\omega$ = 1)          ($\omega$ = 1.6)

    0      363              363              363
    1      139              111               93
    2       99               80               66
    3       82               65               54
    4       71               57               47


Table II. m-step SSOR PCG for 768 x 768 Laplace's Equation

           R/B              Natural          Natural
    m      # Iterations     # Iterations     # Iterations
           ($\omega$ = 1)          ($\omega$ = 1)          ($\omega$ = 1.8)

    0       56               56               56
    1       30               28               17
    2       22               21               13
    3       18               17               10
    4       16               15                9

The results in Tables I and II show that the number of iterations is a
decreasing function of m, as was predicted by (2.13a). The results also
indicate that there will be an optimal value of m, say $m_{opt}$, since for
$m > m_{opt}$ the reduction in the number of CG iterations is not enough to
balance the increase in the time required for the iterations of the SSOR
preconditioner. The actual relative cost of the CG and SSOR iterations on a
computer will be a function of the amount of arithmetic and communication
operations in each algorithm as well as the times to perform these operations
on the machine. Therefore, the optimal value of m will depend on the
architecture of the machine and the problem size, as indicated by the results
in Section 3.

As an example of an application of (2.13b) we consider the Jacobi
splitting of any symmetric and positive definite matrix K that has Property
A (see Young [1971]). For this splitting, P = D where D is the diagonal
of K and therefore P is symmetric and positive definite. Now, since K
has Property A, the eigenvalues of G occur in $\pm\lambda_i$ pairs and
$\lambda_n = |\lambda_1|$ and $\delta = 0$. From (2.13b) we conclude that going from m (even) to
m + 1 (odd) is advantageous if and only if

    $(1 + \lambda_n^{m+1})(1 - \lambda_n^m) < (1 - \lambda_n^{m+1})$

or equivalently,                                                      (2.14)

    $\lambda_n^{m+1} - 2\lambda_n + 1 > 0$.

As m increases the inequality in (2.14) reduces asymptotically to

    $\lambda_n < 1/2$.                                                      (2.15)

For m = 2 and m = 3, the exact conditions are $\lambda_n < .62$ and $\lambda_n < .53$
respectively, but for problems of interest to us, $\lambda_n$ will be closer to 1
and we can conclude that it is not advantageous to increase m from m
(even) to m + 1 (odd). This fact has been verified by numerical experiments
for the m-step Jacobi preconditioner on an 89 x 89 symmetric and positive
definite system that had Property A. The results are given in Table III.
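
The equivalence of the two forms of (2.14), and the limit (2.15), follow from a short calculation. The steps below assume $0 < \lambda_n < 1$, which holds here because $\rho(G) < 1$ and the eigenvalues occur in $\pm$ pairs.

```latex
\begin{align*}
(1+\lambda_n^{m+1})(1-\lambda_n^{m}) &< 1-\lambda_n^{m+1} \\
1 - \lambda_n^{m} + \lambda_n^{m+1} - \lambda_n^{2m+1} &< 1 - \lambda_n^{m+1} \\
-\lambda_n^{m} + 2\lambda_n^{m+1} - \lambda_n^{2m+1} &< 0 \\
\lambda_n^{m+1} - 2\lambda_n + 1 &> 0
  \qquad \text{(dividing by } -\lambda_n^{m} < 0\text{)} \\
\intertext{As $m \to \infty$, $\lambda_n^{m+1} \to 0$ and the condition becomes}
1 - 2\lambda_n &> 0 \quad\Longleftrightarrow\quad \lambda_n < \tfrac{1}{2}.
\end{align*}
```

For m = 2 the cubic factors as $(\lambda_n - 1)(\lambda_n^2 + \lambda_n - 1)$, so the condition holds for $\lambda_n < (\sqrt{5} - 1)/2 \approx .62$, in agreement with the value quoted above.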
Table III. m-step Jacobi Results 89 x 89

    m      # Iterations

    0      45
    1      45
    2      23
    3      36
    4      21
    5      30
    6      18
    7      26
    8      16


Note from Table III that increasing m from 2 to 3, from 4 to 5, and from
6 to 7 also increases the number of iterations from 23 to 36, from 21 to
30, and from 18 to 26 respectively. On the other hand, observe that
increasing m from an odd to a consecutive even number always reduces the
number of iterations. Dubois, Greenbaum, Rodrique [1979] reported similar
results for Poisson's equation. Their results may also be explained by
(2.13b). Also note from Table III that the number of iterations is a
decreasing function of m if we restrict m to be even. In fact this can
easily be shown to be true for all three cases in (2.12).

So far we have only addressed the question of whether a better
conditioned system results by increasing m. We now turn to the question of
how much improvement over m = 1 can be made by taking m > 1 steps of the
preconditioner. Dubois, Greenbaum, and Rodrique [1979] proved that the m-step
PCG method can only reduce the number of iterations needed by the 1-step
PCG method by a factor of m. In practice, this theoretical bound may
not be reached and for a given distribution of eigenvalues it may be sharper
for some values of m than for others. The results of Dubois et al. [1979]
show this for the m-step Jacobi PCG for Laplace's equation. Tables I and II
show for the m-step SSOR PCG method applied to both the plane stress problem
and Laplace's equation that the bound is best for m = 2. Table III shows
that for the m-step Jacobi PCG applied to a problem with Property A the
bound is extremely sharp for m = 2 and extremely poor for odd values of m.

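In order to determine the conditions under which the m-step PCG method
gives the most improvement over the 1-step PCG method, we examine the ratio
$\kappa(\hat{K}_1)/\kappa(\hat{K}_m)$ for both odd and even m with different assumptions about the
distribution of the eigenvalues of G, which are assumed to be ordered as
$-1 < \lambda_1 \le \lambda_2 \le \cdots \le \lambda_n < 1$ with $\delta = \min_i |\lambda_i|$. This ratio can easily be
calculated from the equations of (2.12) and is summarized below for the
various cases.

    $\kappa(\hat{K}_1)/\kappa(\hat{K}_m) =$

        $\frac{1 + \lambda_n + \lambda_n^2 + \cdots + \lambda_n^{m-1}}{1 + \lambda_1 + \lambda_1^2 + \cdots + \lambda_1^{m-1}}$,        $\lambda_1 \ge 0$

        $\frac{(1 + |\lambda_1|)(1 + \lambda_n + \cdots + \lambda_n^{m-1})}{1 + |\lambda_1|^m}$,        $\lambda_1 < 0$, $\lambda_n > 0$, m odd
                                                                      (2.16)
        $\frac{(1 + |\lambda_1|)(1 - \lambda_n^m)}{(1 - \lambda_n)(1 - \delta^m)}$,        $\lambda_1 < 0$, $\lambda_n > |\lambda_1|$, m even

        $\frac{(1 + |\lambda_1|)(1 - |\lambda_1|^m)}{(1 - \lambda_n)(1 - \delta^m)}$,        $\lambda_1 < 0$, $|\lambda_1| \ge |\lambda_n|$, m even

Several observations can be made from (2.16):

    (1) If $\lambda_1 \ge 0$, the maximum value of $\kappa(\hat{K}_1)/\kappa(\hat{K}_m)$ occurs as $\lambda_1 \to 0$ and
        $\lambda_n \to 1$ and is equal to m. (This is the case for the SSOR
        splitting.)

    (2) If $\lambda_1 < 0$ and $\lambda_n > 0$, and m is odd, the maximum value of
        $\kappa(\hat{K}_1)/\kappa(\hat{K}_m)$ occurs when $\lambda_n \to 1$ and is equal to
        $m (1 + |\lambda_1|)/(1 + |\lambda_1|^m)$.

    (3) The m-step PCG method (m > 1) is more effective if $\lambda_n > 0$.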
    (4) If $\lambda_1 < 0$, $|\lambda_1| \ge |\lambda_n|$, and m is even, the maximum value of
        $\kappa(\hat{K}_1)/\kappa(\hat{K}_m)$ occurs when $\lambda_n \to 1$ and $|\lambda_1| = |\lambda_n|$ and is equal to
        $2m/(1 - \delta^m)$.
        Note that the larger $\delta$, the larger this ratio will be. Hence to
        achieve the maximum performance in this case, we would like the value
        of $\delta$ to be as large as possible. For K matrices with
        Property A, this is not possible since $\delta = 0$ and the maximum ratio
        of the two condition numbers is 2m.

In summary, the m-step PCG method gives more improvement over the 1-step
PCG method when an even number of steps of the preconditioner
are taken and the eigenvalues of the matrix G are distributed as
described in (4) above. This implies that for the SSOR iteration
matrix, which has $\lambda_1 \ge 0$, the m-step SSOR preconditioner will not be
extremely effective as m increases. However, by parametrizing this
preconditioner the method is more effective. This is the topic of
the next section.
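
The bounds m and 2m in observations (1) and (4) can be checked directly from the eigenvalues $1 - \lambda_i^m$ of $M_m^{-1} K$. The two spectra below are hypothetical values chosen here to sit close to the limiting cases (a nonnegative spectrum with $\lambda_1 = 0$, $\lambda_n$ near 1, and a spectrum in $\pm$ pairs with $\delta = 0$); they are not taken from the test problems of this paper.

```python
import numpy as np

def kappa_from_spectrum(lam, m):
    """Condition number of I - G^m given the (real) eigenvalues lam of G."""
    mu = 1.0 - np.asarray(lam, dtype=float) ** m
    return mu.max() / mu.min()

def improvement(lam, m):
    """Ratio kappa(K_hat_1) / kappa(K_hat_m) examined in (2.16)."""
    return kappa_from_spectrum(lam, 1) / kappa_from_spectrum(lam, m)

nonneg = [0.0, 0.5, 0.999]       # lambda_1 >= 0, as for the SSOR splitting
paired = [-0.999, 0.0, 0.999]    # +/- pairs with delta = 0, as for Property A

ratio_nonneg = improvement(nonneg, 4)   # approaches m = 4 from below
ratio_paired = improvement(paired, 4)   # approaches 2m = 8 from below
```

Moving the extreme eigenvalue closer to 1 pushes both ratios closer to their respective bounds.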

2.3. Parametrizing the m-step PCG Method

Johnson, Micchelli, and Paul [1982] have suggested symmetrically scaling
the matrix K to have unit diagonal and then taking m terms of a
parametrized Neumann series for $K^{-1} = (I - G)^{-1}$ as the value for $M_m^{-1}$. This
corresponds to a symmetric preconditioning matrix whose inverse is a
polynomial of degree m-1 in G,

    $M_m^{-1} = \alpha_0 I + \alpha_1 G + \alpha_2 G^2 + \cdots + \alpha_{m-1} G^{m-1}$,                 (2.17)

derived from the Jacobi splitting,

    $K = I - G$                                                        (2.18)

of K; hence, the solution to $M_m \tilde{r} = r$ can be implemented by taking m steps
of the Jacobi iterative method applied to $K \tilde{r} = r$ with initial guess
$\tilde{r}^{(0)} = 0$.

Now, $M_m^{-1} K$ can be written as a polynomial in K,

    $M_m^{-1} K = [\alpha_0 I + \alpha_1 (I - K) + \alpha_2 (I - K)^2 + \cdots + \alpha_{m-1} (I - K)^{m-1}] K$,      (2.19)

and Johnson et al. choose the $\alpha_i$ so that the eigenvalues of $M_m^{-1} K$, and
hence those of $M_m$, are positive on the interval $[\lambda_1, \lambda_n]$ that contains the
eigenvalues of K and are as close to 1 as possible in some sense such as
the min-max or the least squares criteria. Clearly, if m = 1, $M_1^{-1} K = \alpha_0 K$
and the condition number of $M_m^{-1} K$ is the same for all $\alpha_0 \ne 0$. Hence, we are
only interested in m > 1.

We now generalize this idea for any splitting of the matrix K,

    $K = P - Q$.                                                       (2.20)

If $G = P^{-1} Q$, then by parametrizing (2.2), the inverse of the m-step
preconditioner becomes

    $M_m^{-1} = (\alpha_0 I + \alpha_1 G + \alpha_2 G^2 + \cdots + \alpha_{m-1} G^{m-1}) P^{-1}$            (2.21)

and will be symmetric if P is symmetric. The expression for $M_m^{-1} K$ is given
by

    $M_m^{-1} K = [\alpha_0 I + \alpha_1 (I - P^{-1} K) + \cdots + \alpha_{m-1} (I - P^{-1} K)^{m-1}] P^{-1} K$      (2.22)

and is seen to be a polynomial in $P^{-1} K$ rather than in K as in (2.19). We
now choose the values of $\alpha_i$ so that the eigenvalues of $M_m^{-1} K$ are positive
on the interval $[\lambda_1, \lambda_n]$ that contains the eigenvalues of $P^{-1} K$ and are as
close to 1 as possible in some sense such as the min-max or least squares
criteria.

When the eigenvalues of G are on the interval [0,1), the eigenvalues
of $P^{-1} K$ are on the interval (0,1] and from (2.22), in the least squares
sense, we wish to find the $\alpha_i$ that minimize

    $\int_0^1 [\alpha_0 x + \alpha_1 (1 - x) x + \cdots + \alpha_{m-1} (1 - x)^{m-1} x - 1]^2 \, dx$.

The appropriate values of the $\alpha_i$, $i = 0, 1, \ldots, m-1$, for the SSOR splitting are
given in Adams [1983]. In the next section we discuss the efficient
implementation of the m-step SSOR preconditioner and the choice for the
relaxation parameter $\omega$ for the SSOR method if the grid points are ordered by
a Multi-color ordering.
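
To make the least squares criterion concrete, the sketch below solves the normal equations for the $\alpha_i$, assuming an unweighted integral over [0,1] exactly as written above. Both that assumption and the function name are introduced here for illustration; the coefficients it returns are not claimed to be the values tabulated in Adams [1983], which may rest on a different weight or interval. With basis functions $\phi_i(x) = (1 - x)^i x$, the required integrals are Beta functions and have the closed forms used in the code.

```python
import numpy as np

def least_squares_alphas(m):
    """Minimize the integral over [0,1] of (sum_i alpha_i (1-x)^i x - 1)^2.

    Normal equations A alpha = b with
      A[i, j] = int (1-x)^(i+j) x^2 dx = 2 / ((s+1)(s+2)(s+3)),  s = i + j
      b[i]    = int (1-x)^i x dx       = 1 / ((i+1)(i+2)).
    Unweighted integral over [0,1] is an assumption of this sketch.
    """
    i = np.arange(m)
    s = i[:, None] + i[None, :]
    A = 2.0 / ((s + 1) * (s + 2) * (s + 3))
    b = 1.0 / ((i + 1) * (i + 2))
    return np.linalg.solve(A, b)
```

For m = 1 this gives the single coefficient 3/2, and for m = 2 it gives (2/3, 10/3). The matrix A becomes ill-conditioned as m grows, so a more careful formulation would be needed for large m.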


3. Implementation and Results

3.1. Implementation Considerations

In order to efficiently implement the m-step SSOR preconditioner on
parallel computers, the equations at the grid points of the problem domain
must be colored, see Adams and Ortega [1982], so that any two equations at
points on the same grid point stencil are different colors. The equations are
then ordered by colors with the equations of the same color being ordered left
to right, top to bottom (for a rectangular grid). In particular, if three
colors are used, the system $K \tilde{r} = r$ has the decoupled form

    $\begin{bmatrix} D_{11} & B_{12} & B_{13} \\ B_{12}^T & D_{22} & B_{23} \\ B_{13}^T & B_{23}^T & D_{33} \end{bmatrix} \begin{bmatrix} \tilde{r}_1 \\ \tilde{r}_2 \\ \tilde{r}_3 \end{bmatrix} = \begin{bmatrix} r_1 \\ r_2 \\ r_3 \end{bmatrix}$        (3.1)

where $D_{ii}$, i = 1 to 3, are diagonal matrices.

The m-step SSOR iteration is implemented as a forward followed by a
backward Multi-color SOR iteration (Adams and Ortega [1982]), but care is taken
to save results from the forward pass in an auxiliary vector to be used in the
reverse pass so that the cost of one SSOR iteration is no more than
the cost of one SOR iteration (Conrad and Wallach [1979]). Specific details
on this implementation (in conjunction with Algorithm 1) for the CYBER 203 and
the Finite Element Machine can be found in Adams [1983].

In addition to the computational work saved by using the auxiliary vector,
the Multi-color ordering permits even more savings. To explain this, we begin
by writing a 3-color SOR iteration matrix, $\mathcal{L}_\omega$, in the following factored
form:

    $\mathcal{L}_\omega = G_\omega B_\omega R_\omega$                                             (3.2)

where $R_\omega$, $B_\omega$, and $G_\omega$ are the matrix operators for the Red, Black, and Green
equations respectively. Nicolaides [1974] discussed the factorization of an
n x n SOR iteration matrix into n operator matrices, one for each
equation, and then showed how these factors combine for matrices with Property
A into two factors, corresponding to the red and black equations
respectively. Young [1971] also gives the factorization of $\mathcal{L}_\omega$ for these
2-colored matrices. Equation (3.2) is a straightforward continuation of these
ideas. To be precise, if the matrix K is given by
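    $K = \begin{bmatrix} D_1 & -X_{12} & -X_{13} \\ -X_{12}^T & D_2 & -X_{23} \\ -X_{13}^T & -X_{23}^T & D_3 \end{bmatrix}$                        (3.3)

with no loss in generality by assuming $D_i = I_i$ on the diagonal, the
$R_\omega$, $B_\omega$, and $G_\omega$ matrices in (3.2) are

    $R_\omega = \begin{bmatrix} (1-\omega) I_1 & \omega X_{12} & \omega X_{13} \\ 0 & I_2 & 0 \\ 0 & 0 & I_3 \end{bmatrix}$,

    $B_\omega = \begin{bmatrix} I_1 & 0 & 0 \\ \omega X_{12}^T & (1-\omega) I_2 & \omega X_{23} \\ 0 & 0 & I_3 \end{bmatrix}$,

and                                                                   (3.4)

    $G_\omega = \begin{bmatrix} I_1 & 0 & 0 \\ 0 & I_2 & 0 \\ \omega X_{13}^T & \omega X_{23}^T & (1-\omega) I_3 \end{bmatrix}$

respectively.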


Similarly, the backward Multi-color SOR iteration matrix may be written in the
factored form

    $\mathcal{U}_\omega = R_\omega B_\omega G_\omega$                                             (3.5)

where $R_\omega$, $B_\omega$, and $G_\omega$ are the same as those of (3.2). Now, the Multi-color SSOR
iteration matrix may be written as

    $\mathcal{S}_\omega = R_\omega B_\omega G_\omega G_\omega B_\omega R_\omega$.                                  (3.6)

A trivial calculation shows that $G_\omega G_\omega = G_{\omega(2-\omega)}$ and $R_\omega R_\omega = R_{\omega(2-\omega)}$. Hence,

    $\mathcal{S}_\omega = R_\omega B_\omega G_{\omega(2-\omega)} B_\omega R_\omega$.                               (3.7)
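
The identities behind (3.7) are easy to confirm numerically. The sketch below builds the three color operators of (3.4) from arbitrary coupling blocks (unit diagonal assumed, as in the text) and can be used to check that squaring an operator is the same as replacing $\omega$ by $\omega(2-\omega)$. The function name and the random test blocks are assumptions introduced for this illustration.

```python
import numpy as np

def color_operators(X12, X13, X23, omega):
    """Return (R, B, G) of (3.4) for a 3-color matrix with unit diagonal.

    X12 is n1 x n2, X13 is n1 x n3, X23 is n2 x n3 (the coupling blocks of K).
    """
    n1, n2 = X12.shape
    n3 = X13.shape[1]
    I1, I2, I3 = np.eye(n1), np.eye(n2), np.eye(n3)
    Z = np.zeros
    # Red operator: relaxes only the first block row.
    R = np.block([[(1 - omega) * I1, omega * X12, omega * X13],
                  [Z((n2, n1)), I2, Z((n2, n3))],
                  [Z((n3, n1)), Z((n3, n2)), I3]])
    # Black operator: relaxes only the second block row.
    B = np.block([[I1, Z((n1, n2)), Z((n1, n3))],
                  [omega * X12.T, (1 - omega) * I2, omega * X23],
                  [Z((n3, n1)), Z((n3, n2)), I3]])
    # Green operator: relaxes only the third block row.
    G = np.block([[I1, Z((n1, n2)), Z((n1, n3))],
                  [Z((n2, n1)), I2, Z((n2, n3))],
                  [omega * X13.T, omega * X23.T, (1 - omega) * I3]])
    return R, B, G
```

The identity holds because $(1-\omega)^2 = 1 - \omega(2-\omega)$ and $\omega + (1-\omega)\omega = \omega(2-\omega)$, which are exactly the diagonal and off-diagonal entries that appear when one of these operators is multiplied by itself.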


From (3.7), we see that the green equations only need to be calculated on the
forward pass with relaxation factor $\hat{\omega} = \omega(2-\omega)$. Likewise, the $R_\omega$
operators combine from the backward pass and the next forward pass so that the
red equations should be updated on the first forward pass with relaxation
factor $\omega$ and on the last backward pass with relaxation factor $\omega$. For the
intermediate forward passes, the red equations should be updated with
$\hat{\omega} = \omega(2-\omega)$. The black equations, however, must be updated on both the
forward and backward passes with relaxation parameter $\omega$, but part of this
calculation can be saved by the use of the auxiliary vector mentioned
earlier. By organizing the computation in this fashion, 2m(c-1)+1 rather
than 2mc operation matrices need to be applied, where c is the number of
colors. Also, this computational
organization is not affected by the introduction of $\alpha_i$, $i = 1, 2, \ldots, m$, since the
parameter $\alpha_i$ multiplies only the right hand side vector $r$ on step m-i+1
of the preconditioner.

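We now briefly discuss the choice for $\omega$. From Young's [1971] theory of
matrices with Property A (2-colored) we know that the optimal $\omega$ for SSOR is
$\omega = 1$. In fact, Young's proof shows that

    $\mathcal{S}_\omega = R_\omega B_{\omega(2-\omega)} R_\omega$                                      (3.8)

and

    $\mathcal{S}_\omega \sim B_{\omega(2-\omega)} R_{\omega(2-\omega)} = \mathcal{L}_{\omega(2-\omega)}$                      (3.9)

where $\sim$ denotes similarity, and for matrices with Property A, $\mathcal{L}_{\omega(2-\omega)}$ has
the smallest spectral radius whenever $\omega = 1$. Now, for Multi-color
matrices, $\mathcal{S}_\omega$ is not necessarily similar to $\mathcal{L}_{\omega(2-\omega)}$, since from (3.7)
with 3 colors

    $\mathcal{S}_\omega \sim B_\omega G_{\omega(2-\omega)} B_\omega R_{\omega(2-\omega)}$.                          (3.10)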

Assume that (3.12) represents an equal number of equations of each color and
let $\omega > 1$ so that $\omega(2-\omega) < 1$. For two colors, (3.9) shows that all
equations are underrelaxed. For three colors, (3.10) shows that we can regard
only the black equations as being overrelaxed (once on the forward and once on
the reverse pass). In general, (3.12) shows that the equations of c-2
colors can be regarded as overrelaxed and the equations of 2 colors as
underrelaxed. When the number of colors approaches the number of equations,
all but two equations can be regarded as being overrelaxed. Although not a
proof, this observation suggests that overrelaxation becomes more worthwhile
as the number of colors increases and that choosing $\omega = 1$ when a small number of
colors is used is a good choice. This was the case for the results in Table
I, where for the R/B/G ordering of nodes (really six colors, with two unknowns
per node) $\omega = 1$ was optimal for m-step SSOR PCG. Results in Adams [1982]
show that $\omega = 1$ was also optimal for the SSOR method (used alone) for this
same problem.
3.2. Results on Parallel Computers

We now give results of the m-step SSOR PCG method for a square plate in
plane stress on both the CYBER 203 and the Finite Element Machine. These
results were discussed in detail in Adams [1983] and are only included here to
show that the method is effective on these machines. Table IV gives the
number of iterations, I, and the time, T, in seconds to solve this problem
using m = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10. The parametrized preconditioner
results are denoted by P, the number of rows in the plate by a, and the
maximum vector length by v.


Table IV. CYBER 203 Iterations and Timings, m-step SSOR PCG

           v = 22        v = 41        v = 132       v = 561        v = 1282        v = 2134
           a = 8         a = 11        a = 20        a = 41         a = 62          a = 80
    m      I     T       I     T       I     T       I     T        I      T        I      T

    0     112   .133    157   .213    271   .565    536   3.293    788   11.845    929   22.780
    1      52   .129     66   .184    111   .454    214   2.373    311    7.832    395   17.194
    2      38   .143     50   .208     79   .478    152   2.428    221    7.773    280   17.380
    2P     31   .116     40   .167     61   .369    118   1.885    172    6.052    218   13.534
    3      31   .155     39   .216     65   .520    124   2.585    181    8.174    229   18.469
    3P     24   .121     30   .167     46   .369     88   1.836    129    5.828    163   13.151
    4P     22   .138     24   .166     35   .350     67   1.726     99    5.471    124   12.306
    5P     19   .143     20   .167     29   .347     56   1.716     82    5.345    104   12.260
    6P     18   .159     18   .175     25   .348     47   1.670     70    5.263     88   12.011
    7P                                 26   .413     43   1.739     64    5.451     80   12.410
    8P                                 21   .375          1.634     54    5.139     69   11.985
    9P                                               33   1.660     48    5.056     61   11.731
    10P                                              31   1.709     44    5.070     55   11.594
We now give the Finite Element Machine results. The same problem with 6
rows and 6 columns of nodes (60 equations) was solved on a 1-, 2-, and then on a
5-processor Finite Element Machine using the m-step SSOR PCG method (as more
processors become available on this machine the solution of larger problems
will be possible). Each processor was assigned equations at an equal number
of R, B, and G nodes. Therefore, in the absence of communication time and
any differences in processor speeds, a speedup of 2 (5) over the one processor
case should be realized whenever 2 (5) processors are used respectively. The
number of iterations, I, and the time, T, in seconds as well as the respective
speedups are given in Table V.



Table V. FEM Iterations, Timings, and Speedups, m-step SSOR PCG

              P = 1              P = 2                       P = 5
    m       I      T          I      T     Speedup        I      T     Speedup

    0      48    63.35       49    33.70    1.92         48    17.70    3.58
    1      19    47.90       19    25.85    1.85         19    14.85    3.23
    2      13    48.75       13    26.65    1.83         13    15.50    3.15
    2P     11    41.95       11    22.95    1.83         11    13.30    3.15
    3      11    54.95       11    30.15    1.82         11    17.65    3.11
    3P      8    41.25        8    22.75    1.81          8    13.25    3.11
    4      10    62.40       10    34.30    1.82         10    20.20    3.09
    4P           39.80             22.00    1.81               12.90    3.09
    5P      5    40.60        5    22.50    1.80          5    13.25    3.06
    6P      5    47.05        5    26.20    1.80
4. Summary and Conclusions

Preconditioners for a symmetric and positive definite system of linear
equations based on taking m steps of an iterative method that is derived
from a symmetric splitting of the coefficient matrix have been described.
Necessary and sufficient conditions were given for these preconditioners to be
symmetric and positive definite for both m odd and even in Theorem 1, and
the relationship between a splitting and its associated iteration matrix was
given in Theorem 2.

The m-step SSOR preconditioner was shown to lead to a system whose
condition number was a decreasing function of m; however, for small problems,
the actual decrease in the number of iterations is not enough to balance the
extra work involved in the preconditioner, as shown in Tables IV and V. By
parametrizing this preconditioner, the number of iterations is reduced enough
so that larger values of m should be used for smaller problems as well. The
optimal number of steps of the preconditioner is seen from Tables IV and V to
be a function of the architecture as well as the problem. The more expensive
the inner products of the outer CG iteration become, the more likely m
should be increased.

We noted that although a theoretical optimal value of $\omega$, the relaxation
parameter for the SSOR method, cannot be found, the choice $\omega = 1$ (when the
nodes are ordered by the Multi-color ordering) was optimal for our plane
stress test problem (6 colors). It is well known that $\omega = 1$ is optimal for
SSOR for matrices that have Young's Property A (Red/Black), but in general
this theory does not extend beyond two colors. However, we conjectured that
if the number of colors is small, choosing $\omega = 1$ is a good choice.

A problem still remains in applying the method to irregular regions since
the grid must be colored and for array machines must also be distributed to
the processors in light of this coloring.